A Partition-Based Implementation of the Relaxed ADMM for Distributed Convex Optimization over Lossy Networks
In this paper we propose a distributed implementation of the relaxed
Alternating Direction Method of Multipliers algorithm (R-ADMM) for optimization
of a separable convex cost function whose terms are stored by a set of
interacting agents, one term per agent. Specifically, the local cost stored by
each node is in general a function of both the state of the node and the states
of its neighbors, a framework that we refer to as "partition-based"
optimization. This framework offers great flexibility and can be adapted to
a large number of different applications. We show that the partition-based
R-ADMM algorithm we introduce is linked to the relaxed Peaceman-Rachford
Splitting (R-PRS) operator which, historically, has been introduced in the
literature to find the zeros of a sum of functions. Interestingly, making use of
nonexpansive operator theory, the proposed algorithm is shown to be provably
robust against random packet losses that might occur in the communication
between neighboring nodes. Finally, the effectiveness of the proposed algorithm
is confirmed by a set of compelling numerical simulations run over random
geometric graphs subject to i.i.d. random packet losses.
Comment: Full version of the paper to be presented at the Conference on
Decision and Control (CDC) 201
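For reference, the relaxed Peaceman-Rachford splitting invoked above admits a
compact schematic form. For minimizing a sum f + g with step size γ > 0 and
relaxation parameter α (notation ours; the paper applies the splitting to the
dual of the partition-based problem):

\[
  z^{k+1} = (1-\alpha)\, z^k + \alpha \big(2\,\mathrm{prox}_{\gamma g} - \mathrm{Id}\big)\big(2\,\mathrm{prox}_{\gamma f} - \mathrm{Id}\big)\, z^k,
  \qquad x^{k+1} = \mathrm{prox}_{\gamma f}(z^{k+1}),
\]

where α = 1/2 recovers the Douglas-Rachford splitting and α = 1 the classical
Peaceman-Rachford iteration.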
Asynchronous Distributed Optimization over Lossy Networks via Relaxed ADMM: Stability and Linear Convergence
In this work we focus on the problem of minimizing the sum of convex cost
functions in a distributed fashion over a peer-to-peer network. In particular,
we are interested in the case in which communications between nodes are prone
to failures and the agents are not synchronized among themselves. We address
the problem by proposing a modified version of the relaxed ADMM, which
corresponds to the Peaceman-Rachford splitting method applied to the dual. By exploiting
results from operator theory, we are able to prove the almost sure convergence
of the proposed algorithm under general assumptions on the distribution of
communication loss and node activation events. By further assuming the cost
functions to be strongly convex, we prove the linear convergence of the
algorithm in mean to a neighborhood of the optimal solution, and provide an
upper bound on the convergence rate. Finally, we present numerical results
testing the proposed method in different scenarios.
Comment: To appear in IEEE Transactions on Automatic Control
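As a toy illustration of the convergence mechanism (our own minimal Python
sketch, not the algorithm of the paper), the following runs a fixed-point
iteration in which each coordinate update is applied only when a Bernoulli
activation succeeds, mimicking asynchronous agents and lossy links:

import numpy as np

rng = np.random.default_rng(0)

# A contractive operator: the gradient-descent map of the strongly convex
# quadratic 0.5 * x' Q x, whose unique fixed point is x* = 0.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])
step = 0.1
T = lambda x: x - step * (Q @ x)

x = np.array([5.0, -3.0])
p = 0.7  # probability that each coordinate update is actually delivered
for k in range(200):
    update = T(x)
    mask = rng.random(x.shape) < p  # random per-coordinate activations
    x = np.where(mask, update, x)   # lost updates keep the previous value

print(x)  # approaches the fixed point x* = 0 despite the random failures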
A Stochastic Operator Framework for Optimization and Learning with Sub-Weibull Errors
This paper proposes a framework to study the convergence of stochastic
optimization and learning algorithms. The framework is built around the
different challenges that these algorithms pose, such as (i) the presence of
random additive errors (e.g. due to stochastic gradients), and (ii) random
coordinate updates (e.g. due to asynchrony in distributed set-ups). The paper
covers both convex and strongly convex problems, and it also analyzes online
scenarios, involving changes in the data and costs. The paper relies on
interpreting stochastic algorithms as the iterated application of stochastic
operators, thus allowing us to use the powerful tools of operator theory. In
particular, we consider operators characterized by additive errors with
sub-Weibull distributions (which parameterize a broad class of errors by their
tail probabilities), and random updates. In this framework we derive convergence
results in mean and in high probability, by providing bounds on the distance of
the current iterate from a solution of the optimization or learning problem.
The contributions are discussed in light of federated learning applications.
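For context, a random variable X is sub-Weibull with tail parameter θ > 0
(notation ours) when, for some K > 0,

\[
  \Pr\big(|X| \ge t\big) \le 2 \exp\!\big(-(t/K)^{1/\theta}\big) \quad \text{for all } t \ge 0,
\]

so θ = 1/2 recovers sub-Gaussian and θ = 1 sub-exponential tails, while larger
θ captures heavier-tailed errors.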
Online Distributed Learning with Quantized Finite-Time Coordination
In this paper we consider online distributed learning problems. Online
distributed learning refers to the process of training learning models on
distributed data sources. In our setting, a set of agents needs to cooperatively
train a learning model from streaming data. Unlike federated
learning, the proposed approach does not rely on a central server but only on
peer-to-peer communications among the agents. This approach is often used in
scenarios where data cannot be moved to a centralized location due to privacy,
security, or cost reasons. In order to overcome the absence of a central
server, we propose a distributed algorithm that relies on a quantized,
finite-time coordination protocol to aggregate the locally trained models.
Furthermore, our algorithm allows for the use of stochastic gradients during
local training. Stochastic gradients are computed using a randomly sampled
subset of the local training data, which makes the proposed algorithm more
efficient and scalable than traditional gradient descent. We then
analyze the performance of the proposed algorithm in terms of the mean distance
from the online solution. Finally, we present numerical results for a logistic
regression task.
Comment: To be presented at IEEE CDC'2
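To make the pipeline concrete, here is a minimal Python sketch (ours; the
uniform quantizer, the ring network, and the plain averaging rounds standing in
for the paper's finite-time coordination protocol are all illustrative
assumptions):

import numpy as np

rng = np.random.default_rng(1)

def quantize(v, delta=0.01):
    # uniform quantizer with resolution delta (illustrative choice)
    return delta * np.round(v / delta)

# Ring of 4 agents; W is a doubly stochastic averaging matrix.
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

models = rng.normal(size=(4, 3))  # one 3-parameter model per agent

for t in range(50):
    # local training: a stochastic gradient step (mocked with random gradients)
    grads = rng.normal(scale=0.1, size=models.shape)
    models = models - 0.1 * grads
    # quantized aggregation: a few averaging rounds stand in for the
    # finite-time coordination protocol of the paper
    q = quantize(models)
    for _ in range(10):
        q = W @ q
    models = q

print(models)  # rows are (approximately) equal aggregated models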
ADMM-Tracking Gradient for Distributed Optimization over Asynchronous and Unreliable Networks
In this paper, we propose (i) a novel distributed algorithm for consensus
optimization over networks and (ii) a robust extension tailored to deal with
asynchronous agents and packet losses. The key idea is to achieve dynamic
consensus on (i) the agents' average and (ii) the global descent direction by
iteratively solving an online auxiliary optimization problem through a
distributed implementation of the Alternating Direction Method of Multipliers
(ADMM). Such a mechanism is suitably interlaced with a local proportional
action steering each agent's estimate toward the solution of the original
consensus optimization problem. First, in the case of ideal networks, by using tools from
system theory, we prove the linear convergence of the scheme with strongly
convex costs. Then, by exploiting averaging theory, we extend this first
result to prove that the robust extension of our method preserves linear
convergence in the case of asynchronous agents and packet losses. Further, by
using the notion of Input-to-State Stability, we also guarantee the robustness
of the schemes with respect to additional, generic errors affecting the agents'
updates. Finally, some numerical simulations confirm our theoretical findings
and show that the proposed methods outperform the existing state-of-the-art
distributed methods for consensus optimization.
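For intuition on the dynamic-consensus mechanism, the sketch below implements
plain gradient tracking with a doubly stochastic matrix W (our illustration
only; the paper instead realizes the consensus steps through a distributed
ADMM and adds the robustification discussed above):

import numpy as np

# Local costs f_i(x) = 0.5 * a_i * (x - b_i)^2; the global minimizer is
# sum(a_i * b_i) / sum(a_i) = 14/6 ≈ 2.33.
a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 3.0])
grad = lambda x: a * (x - b)  # entry i: gradient of f_i at the local copy x_i

W = np.array([[0.6, 0.4, 0.0],
              [0.4, 0.2, 0.4],
              [0.0, 0.4, 0.6]])  # doubly stochastic mixing matrix

x = np.zeros(3)   # local estimates
y = grad(x)       # trackers of the global descent direction
step = 0.02
for k in range(2000):
    x_new = W @ x - step * y           # consensus + descent along tracked direction
    y = W @ y + grad(x_new) - grad(x)  # dynamic consensus on the average gradient
    x = x_new

print(x)  # all local estimates approach the global minimizer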
Online Distributed Learning over Random Networks
The recent deployment of multi-agent systems in a wide range of scenarios has
enabled the solution of learning problems in a distributed fashion. In this
context, agents are tasked with collecting local data and then cooperatively
training a model, without directly sharing the data. While distributed learning
offers the advantage of preserving agents' privacy, it also poses several
challenges in terms of designing and analyzing suitable algorithms. This work
focuses specifically on the following challenges motivated by practical
implementation: (i) online learning, where the local data change over time;
(ii) asynchronous agent computations; (iii) unreliable and limited
communications; and (iv) inexact local computations. To tackle these
challenges, we introduce the Distributed Operator Theoretical (DOT) version of
the Alternating Direction Method of Multipliers (ADMM), which we call the
DOT-ADMM Algorithm. We prove that it converges with a linear rate for a large
class of convex learning problems (e.g., linear and logistic regression
problems) toward a bounded neighborhood of the optimal time-varying solution,
and characterize how the neighborhood depends on (i)--(iv). We
corroborate the theoretical analysis with numerical simulations comparing the
DOT-ADMM Algorithm with other state-of-the-art algorithms, showing that only
the proposed algorithm exhibits robustness to (i)--(iv).
Distributed Prediction-Correction ADMM for Time-Varying Convex Optimization
This paper introduces a dual-regularized ADMM approach to distributed,
time-varying optimization. The proposed algorithm is designed in a
prediction-correction framework, in which the computing nodes predict the
future local costs based on past observations, and exploit this information to
solve the time-varying problem more effectively. In order to guarantee linear
convergence of the algorithm, a regularization is applied to the dual, yielding
a dual-regularized ADMM. We analyze the convergence properties of the
time-varying algorithm, as well as the regularization error of the
dual-regularized ADMM. Numerical results show that in time-varying settings,
despite the regularization error, the performance of the dual-regularized ADMM
can outperform inexact gradient-based methods, as well as exact dual
decomposition techniques, in terms of asymptotic error and consensus
constraint violation.
Comment: Presented at the Asilomar Conference on Signals, Systems, and
Computers 202
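Schematically, a prediction-correction method alternates two steps (notation
ours):

\[
  \text{(prediction)} \quad \hat{x}_{k+1} \approx \operatorname*{arg\,min}_{x} \hat{f}_{k+1}(x),
  \qquad
  \text{(correction)} \quad x_{k+1} \approx \operatorname*{arg\,min}_{x} f_{k+1}(x),
\]

where \hat{f}_{k+1} is a model of the not-yet-observed cost built from past
observations, each minimization is approximated by a few solver iterations
(here, of the dual-regularized ADMM), and the correction is warm-started at
the prediction \hat{x}_{k+1}.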
Operator Theory for Optimization and Learning
Optimization is an important tool in many science and engineering applications, ranging from machine learning to control, from signal processing to power systems management, to name a few. The relevance of optimization in this wide range of applications requires the design of algorithms that can overcome different challenges. Indeed, depending on the application, optimization algorithms may have access to limited computational power, or have hard constraints on the time available for computations.
To address these challenges, optimization research in recent years has increasingly relied on operator theory, with its powerful tools and results. Importantly, this is allowed by the fact that optimization algorithms can be interpreted as the recursive application of an operator, thus translating a minimization problem into a fixed point problem.
The central theme of this thesis is therefore the intersection of optimization and operator theory, and how we can leverage their interplay to solve different challenges and design novel algorithms. We focus in particular on two broad areas of research: stochastic optimization and online optimization.
Stochastic optimization groups all problems in which an algorithm is constrained by the presence of randomness during its execution, e.g. because of additive noise perturbing the computations. In this context, the thesis proposes two contributions. The first is the study of the alternating direction method of multipliers (ADMM) applied to distributed optimization problems when the network is asynchronous and peer-to-peer communications may randomly fail. The second contribution is a framework for stochastic operator theory that allows us to analyze the convergence of a large number of stochastic optimization algorithms in a unified way, with applications in, e.g., machine learning. In this framework we derive both mean and high-probability convergence guarantees, the latter leveraging the powerful formalism of sub-Weibull random variables.
In online optimization, we are faced with the challenge of solving a problem whose cost and constraints change over time, so the goal is to track a sequence of optimizers. The first contribution we propose is a general prediction-correction method, which abstracts many online algorithms and can be used to study their convergence. The prediction-correction method is characterized by the fact that past information is used to warm-start the solution of future problems, and we propose a novel polynomial extrapolation-based strategy to do so. Secondly, in the area of learning to optimize, we propose a novel approach to accelerate online algorithms for weakly convex problems. The method is based on the concept of operator regression, which learns a faster algorithm from samples of the original one.
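As a concrete instance of the polynomial extrapolation strategy mentioned
above, the simplest two-point (linear) predictor builds the next cost from the
last two observed ones:

\[
  \hat{f}_{k+1}(x) = 2 f_k(x) - f_{k-1}(x),
\]

which is exact whenever the cost varies linearly in time; higher-order
predictors use more past samples.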
Finally, the thesis reports numerical results that showcase the practical applications of the theoretical contributions, and discusses the tvopt Python module implemented for prototyping and benchmarking optimization algorithms.
tvopt: A Python Framework for Time-Varying Optimization
This paper introduces tvopt, a Python framework for prototyping and
benchmarking time-varying (or online) optimization algorithms. The paper first
describes the theoretical approach that informed the development of tvopt. Then
it discusses the different components of the framework and their use for
modeling and solving time-varying optimization problems. In particular, tvopt
provides functionalities for defining both centralized and distributed online
problems, and a collection of built-in algorithms to solve them, for example
gradient-based methods, ADMM and other splitting methods. Moreover, the
framework implements prediction strategies to improve the accuracy of the
online solvers. The paper then presents numerical results on a benchmark
problem and discusses their implementation using tvopt. The code for tvopt is
available at https://github.com/nicola-bastianello/tvopt.
Comment: Code available here: https://github.com/nicola-bastianello/tvopt
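As a flavor of the experiments tvopt targets, the following self-contained
NumPy sketch (it deliberately does not use tvopt's own API, whose signatures we
do not reproduce here) tracks the minimizer of a time-varying scalar quadratic
with an online gradient method:

import numpy as np

# Time-varying cost f_k(x) = 0.5 * (x - b_k)^2 with drifting optimizer b_k.
b = lambda k: np.sin(0.1 * k)

x = 0.0
step = 0.5
for k in range(200):
    x = x - step * (x - b(k))       # one online gradient step per sampling time
    tracking_error = abs(x - b(k))  # distance from the current optimizer

print(f"final tracking error: {tracking_error:.4f}")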